# Network Type and Depression Analysis

## Overview

This Stata script analyzes the relationship between network typologies and depression using longitudinal data from two waves.
The script performs data cleaning, variable construction, factor analysis, and fixed-effects regression models with inverse probability weighting (IPW).

**Main dofile:** `dofile_for_type_and_depression_1010.do`

## Requirements

**Stata Version:** 14.0 or higher

**Required Packages:**
```stata
ssc install asdoc
ssc install outreg2
ssc install estout
```
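
If some of these packages may already be present, a loop like the following avoids reinstalling them. This is a minimal sketch, not part of the do-file; it assumes an internet connection for any package that is actually missing.

```stata
* Install each required package only if Stata cannot already find it
foreach pkg in asdoc outreg2 estout {
    capture which `pkg'
    if _rc != 0 {
        ssc install `pkg'
    }
}
```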

## Input Data

1. `cov_netps_nettype_0612.dta` - Main analytical dataset from R clustering scripts

This file is produced by `2_Cluster_Algorithm.R` or `3_Cluster_Algorithm_Simple.R`. It must contain:

**Network variables:** Cluster assignments (`cluster_ID`), kin/non-kin support measures, network satisfaction

**Demographic variables:** `egoid`, `wave`, `gender`/`gender1`, `single`/`single1`, `edu`, `age`, `collegetype`

**Well-being measures:** CES-D components (`sad`, `frustrated`, `depressed`, `hardTOdo`, `sleep`, `happy`, `concentrate`, `live_happy`, `bothering`)

**Social support:** `MeetFrd`, `MeetHangout`, `MeetHelp`, `net_satisfaction`/`net_satis`

**Pandemic variables:** `quarantine`/`quarantine1`

**Other scales:** `sss` (socioeconomic status)

2. `oxford_social_distancing_data.dta` - Oxford COVID-19 Government Response Tracker data

This file contains pandemic policy and severity measures that need to be merged with the main dataset:

**Policy variables:** `openness` (social distancing policy stringency)

**Severity variables:** `pro_confirmed` (provincial confirmed COVID-19 cases)
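
The merge could look roughly like the sketch below, which runs on toy data. The key variables `province` and `wave` and all values are assumptions made for illustration; the actual merge keys are defined in the do-file and may differ.

```stata
* Toy province-level policy data (stand-in for oxford_social_distancing_data.dta)
clear
input byte province byte wave double openness long pro_confirmed
1 1 0.8 120
1 2 0.5 950
2 1 0.9 40
2 2 0.6 300
end
tempfile policy
save `policy'

* Toy respondent-level data (stand-in for the main dataset)
clear
input int egoid byte wave byte province
101 1 1
101 2 1
102 1 2
102 2 2
103 1 2
end

* Many respondent rows map to one province-wave row (hypothetical keys)
merge m:1 province wave using `policy', keepusing(openness pro_confirmed)
assert _merge == 3
drop _merge
```

The `assert _merge == 3` line stops the run if any respondent row fails to match, which makes key mismatches visible early.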

## Setup

Before running the script, update the working directory on line 11:

```stata
cd "your/path/to/data/folder"
```

The script will automatically create two subdirectories:
- `output/` - For data files and tables
- `figure/` - For graphs and plots
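
When running parts of the script by hand, the same folders can be created manually. A minimal sketch; `capture` suppresses the error Stata raises if a folder already exists.

```stata
* Create the output folders relative to the current working directory
capture mkdir "output"
capture mkdir "figure"
```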

## Script Structure

The analysis is organized into four sequential parts:

### Part A: Data Cleaning and Recoding

Performs initial data preparation:
- Recodes partnership status (`single1` → `single`)
- Recodes quarantine status (`quarantine1` → `quarantine`)
- Constructs CES-D depression scale
- Sets unstable clusters to missing (prediction error > 20%, typically cluster 6)
- Identifies respondents with both waves of data
- Standardizes education variable
- Creates lagged health variable

**Output:** `output/covnetps_nettype_cleaned.dta`
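
Two of the steps above, flagging the unstable cluster and identifying respondents with both waves, can be sketched on toy data as follows. The local macro `unstable` is a hypothetical convenience, and the sketch assumes one row per respondent per wave; the do-file's own code may differ.

```stata
clear
input int egoid byte wave byte cluster_ID
1 1 2
1 2 3
2 1 6
2 2 1
3 1 4
end

* Hypothetical local: the cluster flagged as unstable (cluster 6 in this README)
local unstable 6
replace cluster_ID = . if cluster_ID == `unstable'

* bothwave = 1 when a respondent has a row in each of the two waves
bysort egoid: gen byte bothwave = (_N == 2)
```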

### Part B: Data Restructuring

Creates multiple dataset versions:
- Wave 1 only dataset
- Wave 2 only dataset
- Wide-format dataset with network type change indicator
- Merged dataset with type change information
- Pandemic-related variables (log-transformed confirmed cases)

**Outputs:**
- `output/covnetps_nettype_cleaned_w1.dta`
- `output/covnetps_nettype_cleaned_w2.dta`
- `output/changetype.dta`
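
The restructuring described above can be sketched on toy data as follows. The variable name `ln_confirmed` and the `+ 1` inside the logarithm are assumptions for illustration; the do-file may name or transform the variable differently.

```stata
clear
input int egoid byte wave byte cluster_ID long pro_confirmed
1 1 2 100
1 2 2 1000
2 1 1 50
2 2 3 400
end

* Log-transform confirmed cases; +1 is an assumption to keep zero counts defined
gen double ln_confirmed = ln(pro_confirmed + 1)

* Reshape to one row per respondent and flag a change in network type
preserve
keep egoid wave cluster_ID
reshape wide cluster_ID, i(egoid) j(wave)
gen byte change_type = (cluster_ID1 != cluster_ID2) if !missing(cluster_ID1, cluster_ID2)
keep egoid change_type
tempfile changetype
save `changetype'
restore

* Attach the change indicator back to the long-format data
merge m:1 egoid using `changetype', nogenerate
```

Restricting `change_type` to respondents with both cluster values non-missing keeps a missing type from being counted as a change.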

### Part C: Main Analysis

Conducts primary statistical analyses:

1. **Network Type Labeling:** Assigns descriptive labels to clusters (Family, Friend, Restricted, Family&Community, School&Career, Homebody, JustActivity)

2. **Factor Analysis:** Tests one-factor vs. two-factor models of social support
   - Objective support: exchange, personal, help, entertainment
   - Subjective support: MeetFrd, MeetHangout, MeetHelp

3. **Network Type Ranking:** Creates support ranking based on objective and subjective dimensions

4. **Descriptive Statistics:** Generates Table 2 with demographic characteristics by network type

5. **Multinomial Logit:** Predicts network type membership from individual characteristics

6. **IPW Fixed-Effects Models:** Main results examining network type and depression
   - Baseline FE model
   - Interaction model (network type × SES)
   - Ranked support models
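
The IPW fixed-effects step can be illustrated with simulated data. This is a sketch under stated assumptions, not the do-file's code: the attrition covariates (`gender`, `age`), the simulated outcome, and the particular stabilized-weight formula (marginal retention rate divided by predicted probability) are chosen for illustration only.

```stata
clear
set seed 12345
set obs 300
gen int egoid = _n
gen byte gender = runiform() < 0.5
gen double age = 18 + floor(8 * runiform())
* Simulated attrition: retention depends on wave 1 covariates
gen byte bothwave = runiform() < invlogit(0.5 + 0.4 * gender)

* Step 1: model the probability of staying in the panel
logit bothwave gender age
predict double phat, pr

* Step 2: stabilized weight = marginal retention rate / predicted probability
summarize bothwave, meanonly
gen double sw = r(mean) / phat

* Step 3: expand completers to two waves and simulate an outcome
keep if bothwave == 1
expand 2
bysort egoid: gen byte wave = _n
gen byte NETTYPEID = 1 + floor(3 * runiform())
gen double ces_d = 10 + 2 * (NETTYPEID == 2) + rnormal()

* Step 4: weighted fixed-effects model (weights are constant within egoid)
xtset egoid wave
xtreg ces_d i.NETTYPEID i.wave [pweight = sw], fe
```

`xtreg, fe` requires weights to be constant within each panel unit, which holds here because `sw` is computed once per respondent before the data are expanded to two waves.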

**Outputs:**
- `output/Table2_simple.doc` - Descriptive statistics
- `output/mainresults_mlogit.rtf` - Multinomial logit results
- `output/mainresults_fe_interact.rtf` - Fixed-effects models
- `figure/support_scatter_quadrants.pdf` - Network type support profile
- `figure/mlogit_predict_nettype.pdf` - Predicted probabilities
- `figure/marginsplot_interaction_effect.pdf` - Interaction effects
- `output/complementary_analysis.dta` - Dataset for supplementary analyses

### Part D: Complementary and Robustness Checks

Performs sensitivity analyses:
- Restricts to accurately classified clusters
- Tests full unbalanced panel
- Uses ranked support measure
- Hausman test for FE vs RE specification
- Attrition analysis comparing wave 1 respondents who completed wave 2 with those who dropped out

**Outputs:**
- `output/complimentary_fe1.rtf` - Robustness check results
- `output/ttest_by_bothwave_wave1.xlsx` - Attrition t-tests
- `output/attrition_tabs.doc` - Attrition cross-tabs
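
The Hausman comparison follows the usual store-and-test pattern, sketched here on simulated data. The regressor `x` and the data-generating process are hypothetical; the do-file's model specification will differ.

```stata
clear
set seed 2024
set obs 200
gen int egoid = _n
gen double u = rnormal()            // person-level unobserved effect
expand 2
bysort egoid: gen byte wave = _n
gen double x = 0.5 * u + rnormal()  // regressor correlated with u
gen double ces_d = 1 + 0.8 * x + u + rnormal()

xtset egoid wave
xtreg ces_d x, fe
estimates store fe
xtreg ces_d x, re
estimates store re

* H0: RE is consistent; rejecting favors the FE specification
hausman fe re, sigmamore
```

The `sigmamore` option bases both covariance matrices on the same error-variance estimate, which helps avoid a non-positive-definite difference in small samples.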

## Running the Script

### Full Analysis

Run the entire script:
```stata
do "dofile_for_type_and_depression_1010.do"
```

### Running by Sections

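The `do` command cannot run a line range, so to run specific parts, open the script in the Do-file Editor, select the lines for the parts you need, and execute the selection. Always start from line 1 so that earlier parts run first:

```stata
* Open the script in the Do-file Editor
doedit "dofile_for_type_and_depression_1010.do"

* Part A only:        select lines 1-56
* Parts A & B:        select lines 1-100
* Parts A, B & C:     select lines 1-383
* Complete analysis:  run the entire script
```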

**Note:** Part B requires Part A to be completed first. Part C requires Parts A and B. Part D requires all previous parts.

## Key Variables Created

**Depression measure:**
- `ces_d` - Sum of 9 CES-D items (range 0-54, higher = more depressed)
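
A row-sum construction of this kind can be sketched as follows. The toy rows assume each item is scored 0-6, which is what a 0-54 range implies for 9 items; any reverse-coding of positively worded items is defined in the do-file and is not shown here.

```stata
clear
input byte(sad frustrated depressed hardTOdo sleep happy concentrate live_happy bothering)
0 1 2 0 3 6 1 5 0
6 6 6 6 6 6 6 6 6
end

* Sum the 9 items; with the missing option, a row with all items missing stays missing
egen ces_d = rowtotal(sad frustrated depressed hardTOdo sleep happy concentrate live_happy bothering), missing
```

Note that `rowtotal()` treats individual missing items as zero, so respondents with partial item nonresponse receive a lower sum unless they are handled separately.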

**Network type:**
- `cluster_ID` - Network typology assignment (unstable clusters set to missing)
- `NETTYPEID` - Labeled network types

**Support measures:**
- `support_n` - Total objective support count
- `support_scale` - Scaled objective support (0-8)
- `net_sat_sd` - Standardized network satisfaction
- `rank_twodimension` - Combined objective/subjective support ranking

**Panel structure:**
- `bothwave` - Indicator for participating in both waves
- `change_type` - Network type changed between waves (1=yes, 0=no)

**IPW weights:**
- `sw` - Stabilized weights for attrition
- `phat` - Predicted probability of wave 2 participation

## Output Files

**Data files:**
- `covnetps_nettype_cleaned.dta` - Main analytical dataset
- `covnetps_nettype_cleaned_w1.dta` - Wave 1 data only
- `covnetps_nettype_cleaned_w2.dta` - Wave 2 data only
- `changetype.dta` - Network type transitions
- `complementary_analysis.dta` - Robustness check dataset

**Tables:**
- `Table2_simple.doc` - Descriptive statistics by network type
- `mainresults_mlogit.rtf` - Multinomial regression results
- `mainresults_fe_interact.rtf` - Fixed-effects regression results
- `complimentary_fe1.rtf` - Robustness checks
- `attrition_tabs.doc` - Attrition analysis tables
- `ttest_by_bothwave_wave1.xlsx` - Attrition t-tests

**Figures:**
- `support_scatter_quadrants.pdf` - Network types by support dimensions
- `mlogit_predict_nettype.pdf` - Predicted network type probabilities
- `marginsplot_interaction_effect.pdf` - Network type × SES interaction

**Log file:**
- `log_file_nettype.smcl` - Complete execution log

## Troubleshooting

### File not found
Verify that the working directory is set correctly (line 11) and that the files listed under Input Data, starting with `cov_netps_nettype_0612.dta`, are in that directory.

### Missing variables
The input data must come from the R clustering analysis. Ensure you've run either `2_Cluster_Algorithm.R` or `3_Cluster_Algorithm_Simple.R` first and saved the `.dta` output.

### Package not installed
Install missing packages using `ssc install package_name`. If a package is unavailable, check for alternatives or updated package names.

### Cluster 6 missing warning
This is expected. The script sets cluster 6 to missing because it has >20% prediction error (unstable). Adjust line 41 if your clustering results identify a different unstable cluster.

